# Multimodal CLIP Architecture
## vit_base_patch16_clip_224.laion400m_e32
- **License:** MIT
- **Description:** Vision Transformer model trained on the LAION-400M dataset, compatible with both the OpenCLIP and timm frameworks.
- **Task:** Image Classification
- **Framework:** timm
- **Downloads:** 5,751 · **Likes:** 0
## vit_base_patch32_clip_224.laion400m_e31
- **License:** MIT
- **Description:** Vision Transformer model trained on the LAION-400M dataset, compatible with both the OpenCLIP and timm frameworks.
- **Task:** Image Classification
- **Framework:** timm
- **Downloads:** 10.90k · **Likes:** 0
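The CLIP-style checkpoints above embed images and text into a shared vector space and score image–text matches by cosine similarity of L2-normalized embeddings. A minimal, framework-free sketch of that scoring step (the 4-dimensional toy vectors below stand in for real 512-dimensional model outputs and are not produced by any actual checkpoint):

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length, as CLIP does before comparing embeddings."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine similarity = dot product of the unit-normalized vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

# Toy embeddings; a real pipeline would get these from the image and text towers.
image_emb = [0.2, 0.9, 0.1, 0.4]
text_embs = {
    "a photo of a cat": [0.25, 0.85, 0.05, 0.35],
    "a photo of a dog": [0.9, 0.1, 0.4, 0.2],
}

scores = {caption: cosine_similarity(image_emb, emb) for caption, emb in text_embs.items()}
best_caption = max(scores, key=scores.get)
```

In zero-shot classification, each class name is wrapped in a prompt such as "a photo of a {label}", and the caption with the highest cosine similarity to the image embedding wins.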
## BiomedCLIP ViT BERT HF
- **License:** MIT
- **Description:** A BiomedCLIP model implemented in PyTorch with the Hugging Face frameworks, reproducing the original microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224 model.
- **Task:** Multimodal Fusion
- **Framework:** Transformers · **Language:** English
- **Publisher:** chuhac
- **Downloads:** 4,437 · **Likes:** 1
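To turn similarity scores into class probabilities, CLIP-style models such as BiomedCLIP apply a temperature-scaled softmax to the image–text logits. A small sketch of that final step (the similarity values are made up, and the logit scale of 100.0 is only the conventional value CLIP's learned temperature tends to reach, not one read from any of these checkpoints):

```python
import math

def softmax(logits):
    """Numerically stable softmax: shift by the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy cosine similarities for one image against three candidate captions.
similarities = [0.31, 0.24, 0.12]
logit_scale = 100.0  # assumed CLIP-style temperature; real models learn this value
probs = softmax([logit_scale * s for s in similarities])
```

Scaling by the temperature before the softmax sharpens the distribution, so small gaps in cosine similarity translate into confident probabilities.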